Image Processing

By: Chuang-Jan Chang

Ming Chi University of Technology, Taiwan

Outline:

  • Getting started with Image Processing
    • What is image processing
    • The image processing pipeline
  • Morphological Image Processing
  • Affine transformation

Getting started with Image Processing

As the name suggests, image processing can simply be defined as the processing (analyzing and manipulating) of images with algorithms in a computer (through code). It has a few different aspects, such as storage, representation, information extraction, manipulation, enhancement, restoration, and interpretation of images.

What is an image and how it is stored on a computer

Conceptually, an image in its simplest form (single-channel; for example, binary, monochrome, grayscale, or black-and-white images) is a two-dimensional function f(x,y) that maps a coordinate pair to an integer or real value related to the intensity or color of the point. Each point is called a pixel or pel (picture element). An image can have multiple channels too (for example, colored RGB images, where a color is represented using three channels: red, green, and blue). For a colored RGB image, each pixel at the (x,y) coordinate can be represented by a three-tuple (r_{x,y}, g_{x,y}, b_{x,y}).

In order to be processed on a computer, an image f(x,y) needs to be digitized both spatially and in amplitude. Digitization of the spatial coordinates (x,y) is called image sampling. Amplitude digitization is called gray-level quantization. In a computer, a pixel value corresponding to a channel is generally represented as an integer value between 0 and 255 or a floating-point value between 0 and 1. An image is stored as a file, and there can be many different types (formats) of files. Each file generally has some metadata and some data that can be extracted as multi-dimensional arrays (for example, 2-D arrays for binary or gray-level images and 3-D arrays for RGB and YUV colored images).
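The two digitization steps can be sketched with NumPy. This is a minimal illustration rather than part of the original notebook: the continuous image `f`, the 4 x 6 grid, and all variable names are assumptions chosen for the example.

```python
import numpy as np

def f(x, y):
    """Hypothetical continuous image: intensity in [0, 1] rising from left to right."""
    return x + 0.0 * y

# Sampling: evaluate f on a discrete grid of 4 rows x 6 columns
rows, cols = 4, 6
ys, xs = np.meshgrid(np.linspace(0, 1, rows), np.linspace(0, 1, cols), indexing="ij")
sampled = f(xs, ys)                      # float values in [0, 1], shape (rows, cols)

# Quantization: map each amplitude to one of 256 gray levels (8-bit integers)
gray_u8 = np.round(sampled * 255).astype(np.uint8)

# The same image expressed as floating-point values in [0, 1]
gray_float = gray_u8 / 255.0

# A color image adds a channel axis: (rows, cols, 3) for R, G, B
rgb = np.stack([gray_u8, gray_u8, gray_u8], axis=-1)

print(gray_u8.shape, gray_u8.dtype, gray_u8.min(), gray_u8.max())
print(rgb.shape)
```

A finer grid corresponds to a higher sampling rate, and using fewer than 256 levels corresponds to a coarser quantization.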

Single-channel images: binary and grayscale images (image.png)

A multi-channel image, such as an RGB image, requires a 3-D array; when loaded as a NumPy array its dimensions are height x width x 3 (image.png)

Example binary, grayscale, and RGB images (img2.png)

What is image processing?

Image processing refers to the automatic processing, manipulation, analysis, and interpretation of images using algorithms and code on a computer. It has applications in many disciplines and fields in science and technology such as television, photography, robotics, remote sensing, medical diagnosis, and industrial inspection. Social networking sites such as Facebook and Instagram, which we have become used to in our daily lives and where we upload tons of images every day, are typical examples of industries that need to use and innovate many image processing algorithms to process the images we upload.

In this book, we are going to use a few Python packages to process an image. First, we shall use a set of libraries to do classical image processing: extracting image data and transforming it with library functions to pre-process, enhance, restore, represent (with descriptors), segment, classify, and detect and recognize (objects), so that we can analyze, understand, and interpret the data better. Next, we shall use another set of libraries to do image processing based on deep learning, a technology that has become very popular in the last few years.

Some typical applications of image processing include medical/biological fields (for example, X-rays and CT scans), computational photography (Photoshop), fingerprint authentication, face recognition, and so on.

The image processing pipeline

image.png

Reading an image

In [1]:
# Libraries and aliases used by the cells in this notebook
import os
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.pylab as pylab
from PIL import Image, ImageDraw, ImageFont
import skimage as skim
import skimage.io
from skimage import transform as tr
from skimage.io import imread
from skimage.color import rgb2gray
from skimage.morphology import (binary_erosion, erosion, dilation, opening, closing,
                                rectangle, square, remove_small_objects)
import cv2 as cv
import tensorflow as tf

In [2]:
im = Image.open("./assets/parrot.jpg")
print(im.size)
(453, 340)

Displaying an image

In [3]:
plt.imshow(im)
plt.axis('off')
plt.show()

Saving an image

In [5]:
print("Before saving image:" + str(os.listdir(os.getcwd()))) 
img_gray = im.convert("LA")

filename = 'grayscale.png'
img_gray.save(filename) 
print("After saving image:" + str(os.listdir(os.getcwd())))   

plt.imshow(img_gray)
plt.axis('off')
plt.show()
Before saving image:['teaching_material_week_1.ipynb', '.ipynb_checkpoints', 'assets']
After saving image:['grayscale.png', 'teaching_material_week_1.ipynb', '.ipynb_checkpoints', 'assets']

Draw on image

We can draw lines or other geometric shapes on an image (for example, the ellipse() function draws an ellipse).

In [6]:
img = im.copy(); draw = ImageDraw.Draw(img)
draw.ellipse((150, 125, 220, 250), fill=(255,255,255,128)); del draw
plt.figure(figsize=(10,5))
plt.subplot(1,2,1); plt.imshow(im); plt.title("Original Image", size =20 ); plt.axis('off')
plt.subplot(1,2,2); plt.imshow(img); plt.title("Image After Draw Object", size =20 ); plt.axis('off')
plt.tight_layout(); plt.show()

Drawing text on image

We can add text to an image with the text() function, using a TrueType font loaded through ImageFont.truetype().

In [7]:
im_text = im.copy()
add_text = ImageDraw.Draw(im_text)
font = ImageFont.truetype("LiberationSans-Regular.ttf", 23) # use a truetype font
add_text.text((10, 5), "Hallo Draw image", font=font)
plt.figure(figsize=(10,5))
plt.subplot(1,2,1); plt.imshow(im); plt.title("Original Image", size =20 ); plt.axis('off')
plt.subplot(1,2,2); plt.imshow(im_text); plt.title("Image With Text", size =20 ); plt.axis('off')
plt.tight_layout(); plt.show()

Blending image

In [8]:
im1 = Image.open("./assets/parrot.png")
im2 = Image.open("./assets/hill.png")
# two images have different modes, must be converted to the same mode
im1 = im1.convert('RGBA') 
# two images have different sizes, must be converted to the same size
im2 = im2.resize((im1.width, im1.height), Image.BILINEAR) 
im_blend = Image.blend(im1, im2, alpha=0.5)
plt.figure(figsize=(18,6))
plt.subplot(1,3,1); plt.imshow(im1); plt.title("Image 1", size =20 ); plt.axis('off')
plt.subplot(1,3,2); plt.imshow(im2); plt.title("Image 2", size =20 ); plt.axis('off')
plt.subplot(1,3,3); plt.imshow(im_blend); plt.title("Blending Image", size =20 );plt.axis('off')
plt.tight_layout(); plt.show()
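Pillow documents `Image.blend` as the linear interpolation `out = image1 * (1.0 - alpha) + image2 * alpha`. The following NumPy sketch reproduces that formula on two tiny arrays so the arithmetic is visible; the helper name `blend` and the test arrays are assumptions for illustration, and the rounding used here may differ slightly from Pillow's internal rounding.

```python
import numpy as np

def blend(a, b, alpha):
    """Linearly interpolate two same-shaped uint8 images (hypothetical helper)."""
    a = a.astype(np.float64)
    b = b.astype(np.float64)
    out = a * (1.0 - alpha) + b * alpha
    # Round and clip back into the valid 8-bit range
    return np.clip(np.round(out), 0, 255).astype(np.uint8)

dark = np.zeros((2, 2, 3), dtype=np.uint8)          # all pixels 0
bright = np.full((2, 2, 3), 200, dtype=np.uint8)    # all pixels 200
half = blend(dark, bright, alpha=0.5)
print(half[0, 0])
```

With alpha=0 the result equals the first image and with alpha=1 it equals the second, which is why both inputs must share the same size and mode.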

Resize image

In [9]:
size_ori = str((im.width, im.height))
im_small = im.resize((im.width//5, im.height//5), Image.LANCZOS) # Image.ANTIALIAS was an alias of LANCZOS and was removed in Pillow 10
size_small = str((im_small.width, im_small.height))
plt.figure(figsize=(14,6))
plt.subplot(1,2,1); plot_image(im, title='Size ori ' + size_ori) ; plt.axis('on')
plt.subplot(1,2,2); plot_image(im_small, title='Size small ' + size_small) ; plt.axis('on')
plt.tight_layout(); plt.show()
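The cell above and several later cells call plot_image(), and the morphology cells also call plot_images_horizontally(); neither helper is defined in this notebook export. A minimal sketch of what they might look like is given here. The bodies are assumptions inferred from how the helpers are called (argument order, the `title` keyword, and the `sz` figure-size keyword), not the original definitions.

```python
import matplotlib.pyplot as plt

def plot_image(image, title=''):
    """Show one image in the current axes with a title and the axes hidden."""
    plt.title(title, size=20)
    plt.imshow(image)
    plt.axis('off')

def plot_images_horizontally(original, filtered, filter_name, sz=(18, 7)):
    """Show an input image and its filtered version side by side."""
    plt.figure(figsize=sz)
    plt.gray()  # note: this also sets the default colormap to gray for later plots
    plt.subplot(1, 2, 1); plot_image(original, 'original')
    plt.subplot(1, 2, 2); plot_image(filtered, filter_name)
    plt.show()
```

Because plot_image() hides the axes, cells that want to show pixel coordinates call plt.axis('on') afterwards.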

Crop Image

In [10]:
im_c = im.crop((175,75,320,200)) 
# crop the rectangle given by (left, top, right, bottom) from the image
# plt.imshow(im_c) ; plt.axis("off"); plt.show()
plt.figure(figsize=(10,5))
plt.subplot(1,2,1); plot_image(im, title='Original Image ' ) ; plt.axis('on')
plt.subplot(1,2,2); plot_image(im_c, title='Cropping Image ') ; plt.axis('on')
plt.tight_layout(); plt.show()

Separating the RGB channels of an image

In [11]:
ch_r, ch_g, ch_b = im.split() # split the RGB image into 3 channels: R, G and B
# we shall use matplotlib to display the channels
plt.figure(figsize=(18,6))
plt.subplot(1,3,1); plt.imshow(ch_r, cmap=plt.cm.Reds); plt.axis('off')
plt.subplot(1,3,2); plt.imshow(ch_g, cmap=plt.cm.Greens); plt.axis('off')
plt.subplot(1,3,3); plt.imshow(ch_b, cmap=plt.cm.Blues); plt.axis('off')
plt.tight_layout()
plt.show() # show the R, G, B channels

Applying the swirl transform

This is a non-linear transform defined in the scikit-image documentation. The next code snippet shows how to use the swirl() function to implement the transform, where strength is a parameter to the function for the amount of swirl, radius indicates the swirl extent in pixels, and rotation adds a rotation angle. Internally, radius is converted into a decay constant r so that the swirl decays to approximately 1/1000th of its strength within the specified radius.
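The decay claim can be checked numerically. The sketch below assumes the mapping given in the scikit-image swirl documentation, namely r = ln(2) * radius / 5 and a new angle of rotation + strength * exp(-rho / r) + theta, where rho is the distance from the swirl center; the function name `swirl_angle` is hypothetical and the code does not call scikit-image itself.

```python
import math

def swirl_angle(rho, theta, rotation=0.0, strength=10.0, radius=200.0):
    """New polar angle of a point at distance rho from the swirl center."""
    r = math.log(2) * radius / 5              # radius rescaled into the decay constant r
    return rotation + strength * math.exp(-rho / r) + theta

radius = 200.0
r = math.log(2) * radius / 5
decay_at_radius = math.exp(-radius / r)       # fraction of the swirl left at rho == radius
print(f"fraction of swirl remaining at the radius: {decay_at_radius:.5f}")

# The added angle shrinks quickly as we move away from the center
for rho in (0, 50, 100, 200):
    print(rho, swirl_angle(rho, theta=0.0))
```

The remaining fraction depends only on the constant 5 / ln(2), so it is the same for any choice of radius.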

In [12]:
im = skim.io.imread("./assets/parrot.png")
swirled = tr.swirl(im, rotation=0, strength=10, radius=200)
plt.figure(figsize=(10,5))
plt.subplot(1,2,1); plot_image(im, title='Original Image ' ) 
plt.subplot(1,2,2); plot_image(swirled, title='Swirled Image ') 
plt.tight_layout(); plt.show()

Extracting the boundary

The erosion operation can be used to extract the boundary of a binary image: we just need to subtract the eroded image from the input binary image. The following code block implements this.

In [13]:
im = rgb2gray(imread('./assets/horse-dog.jpg'))
threshold = 0.5
im[im < threshold] = 0; im[im >= threshold] = 1
boundary = im - binary_erosion(im)
plot_images_horizontally(im, boundary, 'boundary',sz=(10,5))
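To see why subtracting the eroded image leaves only the boundary, the same idea can be traced on a tiny array. This is a sketch under stated assumptions: `erode_3x3` is a hypothetical NumPy-only stand-in for erosion with a 3x3 square structuring element, not the scikit-image function used in the cell above.

```python
import numpy as np

def erode_3x3(mask):
    """Binary erosion with a 3x3 square: a pixel survives only if it and
    all 8 of its neighbours are foreground."""
    padded = np.pad(mask, 1, constant_values=False)   # outside the image counts as background
    out = np.ones_like(mask, dtype=bool)
    h, w = mask.shape
    for dr in range(3):
        for dc in range(3):
            # AND together the nine shifted copies of the image
            out &= padded[dr:dr + h, dc:dc + w]
    return out

square = np.zeros((7, 7), dtype=bool)
square[1:6, 1:6] = True                               # a filled 5x5 square

boundary = square & ~erode_3x3(square)                # input minus its erosion
print(boundary.astype(int))
```

Erosion strips the outermost layer of foreground pixels, so the set difference between the input and its erosion is exactly that one-pixel-wide layer.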

Morphological Image Processing

Morphological image processing is a collection of non-linear operations related to the shape or morphology of features in an image. These operations are particularly suited to the processing of binary images (where pixels are represented as 0 or 1 and, by convention, the foreground of the object = 1 or white and the background = 0 or black), although they can be extended to grayscale images.

In morphological operations, a structuring element (a small template image) is used to probe the input image. The algorithms work by positioning the structuring element at all possible locations in the input image and comparing it with the corresponding neighborhood of the pixels with a set operator. Some operations test whether the element fits within the neighborhood, while others test whether it hits or intersects the neighborhood. A few popular morphological operators or filters are binary dilation and erosion, opening and closing, thinning, skeletonizing, morphological edge detectors, hit or miss filters, rank filters, median filters, and majority filters.
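The "fits" and "hits" tests can be written out directly. The following is a minimal, unoptimized sketch: the function name `probe` and its `test` argument are assumptions for illustration, the structuring element is assumed to have odd dimensions, and pixels outside the image are treated as background. For a symmetric structuring element, the "fit" test corresponds to binary erosion and the "hit" test to binary dilation.

```python
import numpy as np

def probe(image, selem, test):
    """Slide selem over every pixel and apply a 'fit' or 'hit' test."""
    image = np.asarray(image, dtype=bool)
    selem = np.asarray(selem, dtype=bool)
    sh, sw = selem.shape                          # assumed odd, so the element has a center
    padded = np.pad(image, ((sh // 2, sh // 2), (sw // 2, sw // 2)),
                    constant_values=False)
    out = np.zeros(image.shape, dtype=bool)
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            # Image pixels lying under the "on" cells of the structuring element
            covered = padded[r:r + sh, c:c + sw][selem]
            # fit: every covered pixel is foreground; hit: at least one is
            out[r, c] = covered.all() if test == "fit" else covered.any()
    return out

image = np.zeros((5, 5), dtype=bool)
image[1:4, 1:4] = True                            # a 3x3 foreground block
selem = np.ones((3, 3), dtype=bool)

eroded = probe(image, selem, "fit")
dilated = probe(image, selem, "hit")
print(eroded.astype(int))
print(dilated.astype(int))
```

Library implementations such as those in scikit-image are much faster; the loop form is shown only because it mirrors the description in the text.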

1. Erosion

In [14]:
im = rgb2gray(imread('./assets/clock2.jpg'))
im[im <= 0.5] = 0 # create binary image with fixed threshold 0.5
im[im > 0.5] = 1
pylab.gray()
pylab.figure(figsize=(20,10))
pylab.subplot(1,3,1), plot_image(im, 'original')
im1 = binary_erosion(im, rectangle(1,5))
pylab.subplot(1,3,2), plot_image(im1, 'erosion with rectangle size (1,5)')
im1 = binary_erosion(im, rectangle(1,15))
pylab.subplot(1,3,3), plot_image(im1, 'erosion with rectangle size (1,15)')
pylab.show()
In [15]:
im = rgb2gray(imread('./assets/zebras.jpg'))
struct_elem = square(12)
eroded = erosion(im, struct_elem)
plot_images_horizontally(im, eroded, 'erosion')

2. Dilation

In [16]:
from skimage.morphology import binary_dilation, disk
from skimage import img_as_float
im = img_as_float(imread('./assets/tagore.png'))
im = 1 - im[...,3]
im[im <= 0.5] = 0; im[im > 0.5] = 1
pylab.gray()
pylab.figure(figsize=(18,9))
pylab.subplot(131)
pylab.imshow(im)
pylab.title('original', size=20)
pylab.axis('off')
for d in range(1,3):
    pylab.subplot(1,3,d+1)
    im1 = binary_dilation(im, disk(2*d))
    pylab.imshow(im1)
    pylab.title('dilation with disk size ' + str(2*d), size=20)
    pylab.axis('off')
pylab.show() 
In [17]:
dilated = dilation(im, struct_elem)
plot_images_horizontally(im, dilated, 'dilation')

3. Opening

In [18]:
from skimage.morphology import binary_opening, binary_closing, binary_erosion, binary_dilation, disk
im = rgb2gray(imread('./assets/circles.jpg'))
im[im <= 0.5] = 0
im[im > 0.5] = 1
pylab.gray()
pylab.figure(figsize=(15,7))
pylab.subplot(1,3,1), plot_image(im, 'original')
im1 = binary_opening(im, disk(12))
pylab.subplot(1,3,2), plot_image(im1, 'opening with disk size ' + str(12))
im1 = binary_closing(im, disk(6))
pylab.subplot(1,3,3), plot_image(im1, 'closing with disk size ' + str(6))
pylab.show()
In [19]:
opened = opening(im, struct_elem)
plot_images_horizontally(im, opened, 'opening')

4. Closing

In [20]:
closed = closing(im, struct_elem)
plot_images_horizontally(im, closed, 'closing')

5. Removing small objects

In [21]:
im = rgb2gray(imread('./assets/circles.jpg'))
im[im > 0.5] = 1 # create binary image by thresholding with fixed threshold 0.5
im[im <= 0.5] = 0
im = im.astype(bool) # np.bool was removed in NumPy 1.24; the built-in bool is equivalent
pylab.figure(figsize=(8,8))
pylab.subplot(2,2,1), plot_image(im, 'original')
i = 2
for osz in [50, 200, 500]:
    im1 = remove_small_objects(im, osz, connectivity=1)
    pylab.subplot(2,2,i), plot_image(im1, 'removing below size ' + str(osz))
    i += 1
pylab.show()

Affine Transformation

An affine transformation modifies the geometric structure of an image while preserving the parallelism of lines, but not necessarily lengths and angles. It also preserves collinearity and ratios of distances along a line. It is widely used in machine learning and deep learning for image processing and for image augmentation. The technique is also used to correct geometric distortions and deformations that occur with non-ideal camera angles, for example in satellite imagery.

The Affine Transformation relies on matrices to handle rotation, shear, translation and scaling.[1]

image.png
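The four transformations can each be written as a 3x3 matrix acting on homogeneous coordinates (x, y, 1). The sketch below uses the common textbook form of these matrices; the function names are assumptions for illustration, and the parametrization is not claimed to match the one used internally by the TensorFlow function in the cells that follow.

```python
import numpy as np

def translation(tx, ty):
    return np.array([[1, 0, tx], [0, 1, ty], [0, 0, 1]], dtype=float)

def rotation(theta_deg):
    t = np.deg2rad(theta_deg)
    return np.array([[np.cos(t), -np.sin(t), 0],
                     [np.sin(t),  np.cos(t), 0],
                     [0, 0, 1]])

def scaling(sx, sy):
    return np.diag([sx, sy, 1.0])

def shear_x(k):
    # Shifts each point along x by k times its y coordinate
    return np.array([[1, k, 0], [0, 1, 0], [0, 0, 1]], dtype=float)

def apply(matrix, points):
    """Apply a 3x3 affine matrix to an (N, 2) list of points."""
    pts = np.asarray(points, dtype=float)
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])
    return (homogeneous @ matrix.T)[:, :2]

unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]
moved = apply(translation(40, 60), unit_square)
sheared = apply(shear_x(0.5), unit_square)
# Matrices compose by multiplication: scale first, then rotate, then translate
combined = apply(translation(40, 60) @ rotation(90) @ scaling(2, 2), unit_square)
print(moved)
print(sheared)
print(combined)
```

In the sheared square the top and bottom edges stay parallel even though the corner angles change, which is the defining property of an affine map.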

1. Translation

A translation is a function that moves every point by a constant distance in a specified direction. In TensorFlow, it is specified with tx and ty, which together give the direction and the distance of the shift.

  • tx: Width shift.
  • ty: Height shift.

So after specifying the values of tx and ty, we can get our desired result.

image.png

In [23]:
img = cv.imread('./assets/image_1.png')
transformation = tf.keras.preprocessing.image.apply_affine_transform(img, tx=40, ty=60 )
transformation = cv.cvtColor(transformation, cv.COLOR_BGR2RGB)
plt.imshow(transformation);plt.axis('off')
Out[23]:
(-0.5, 256.5, 251.5, -0.5)

2. Rotation

Rotation is a circular transformation around a point or an axis. We can specify the angle of rotation to rotate our image around a point or an axis.

image.png

In TensorFlow, the rotation angle is passed as theta, in degrees.

  • theta: Rotation angle in degrees.
In [24]:
transformation = tf.keras.preprocessing.image.apply_affine_transform(img,theta=270)
transformation = cv.cvtColor(transformation, cv.COLOR_BGR2RGB)
plt.imshow(transformation)
plt.axis('off'); plt.show()

3. Scaling

Scaling is a linear transformation that enlarges or shrinks objects by a scale factor; the scaling is uniform when the factor is the same in all directions. We can specify the scale factors, passed to TensorFlow as zx and zy, to enlarge or shrink our images; the effect is to zoom in or out of the image.

image.png

  • zx: Zoom in x direction.
  • zy: Zoom in y direction.
In [25]:
transformation = tf.keras.preprocessing.image.apply_affine_transform(img, zx=0.5, zy= 0.5 )
transformation = cv.cvtColor(transformation, cv.COLOR_BGR2RGB)
plt.axis('off'); plt.imshow(transformation)
plt.show()

4. Shear

Shear is sometimes also referred to as transvection. A transvection is a function that shifts every point in a basis direction (x or y) by a distance proportional to its coordinate along the other axis.

image.png

In [26]:
transformation = tf.keras.preprocessing.image.apply_affine_transform(img, shear=25)
transformation = cv.cvtColor(transformation, cv.COLOR_BGR2RGB)
plt.imshow(transformation)
plt.axis("off"); plt.show()

Next lecture

  • Face recognition
  • GitHub for development project

Thank you
